ICA with Reconstruction Cost for Efficient Overcomplete Feature Learning
Le, Quoc V., Karpenko, Alexandre, Ngiam, Jiquan, Ng, Andrew Y.
Independent Components Analysis (ICA) and its variants have been successfully used for unsupervised feature learning. However, standard ICA requires an orthonormality constraint to be enforced, which makes it difficult to learn overcomplete features. In addition, ICA is sensitive to whitening. These properties make it challenging to scale ICA to high-dimensional data. In this paper, we propose a robust soft reconstruction cost for ICA that allows us to learn highly overcomplete sparse features even on unwhitened data.
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- North America > Canada > Ontario > Toronto (0.04)
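The soft reconstruction cost the abstract describes replaces ICA's hard orthonormality constraint with a penalty on how poorly the (possibly overcomplete) filter matrix reconstructs the input. A minimal sketch of such an objective, with an illustrative weight `lam` and a plain L1 sparsity term standing in for whatever smoothed penalty an actual implementation might use:

```python
import numpy as np

def rica_cost(W, X, lam=0.1):
    """Reconstruction-cost ICA objective (sketch).

    W : (k, n) filter matrix, k may exceed n (overcomplete).
    X : (n, m) data matrix, one example per column.
    The soft reconstruction term ||W^T W x - x||^2 replaces the
    hard orthonormality constraint W W^T = I of standard ICA.
    """
    WX = W @ X                 # feature activations
    recon = W.T @ WX - X       # reconstruction residual
    return np.sum(recon ** 2) + lam * np.sum(np.abs(WX))
```

When `W` happens to be orthonormal the reconstruction term vanishes and only the sparsity penalty remains, which is why this cost can be read as a soft relaxation of the orthonormality constraint.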
Large Scale Adversarial Representation Learning
Donahue, Jeff, Simonyan, Karen
Adversarially trained generative models (GANs) have recently achieved compelling image synthesis results. But despite early successes in using GANs for unsupervised representation learning, they have since been superseded by approaches based on self-supervision. In this work we show that progress in image generation quality translates to substantially improved representation learning performance. Our approach, BigBiGAN, builds upon the state-of-the-art BigGAN model, extending it to representation learning by adding an encoder and modifying the discriminator. We extensively evaluate the representation learning and generation capabilities of these BigBiGAN models, demonstrating that these generation-based models achieve the state of the art in unsupervised representation learning on ImageNet, as well as in unconditional image generation.
- Information Technology > Sensing and Signal Processing > Image Processing (1.00)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
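The encoder-plus-modified-discriminator structure the BigBiGAN abstract builds on comes from the BiGAN setup, where a single discriminator scores joint (data, latent) pairs. A toy linear sketch of that loss structure (all weights and dimensions here are illustrative, not the paper's architecture):

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

# Toy linear encoder E, generator G, and joint discriminator d over
# concatenated (x, z) pairs; data dim 2, latent dim 4.
E = rng.normal(size=(4, 2))   # data -> latent
G = rng.normal(size=(2, 4))   # latent -> data
d = rng.normal(size=6)        # discriminator weights on concat(x, z)

def D(x, z):
    return sigmoid(d @ np.concatenate([x, z]))

def bigan_disc_loss(x, z):
    # The discriminator sees "real" pairs (x, E(x)) versus
    # "generated" pairs (G(z), z) and must tell them apart.
    real = D(x, E @ x)
    fake = D(G @ z, z)
    return -np.log(real + 1e-9) - np.log(1.0 - fake + 1e-9)
```

At the saddle point of this game the encoder and generator become approximate inverses, which is what lets the encoder serve as a representation learner.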
Ladder Networks for Semi-Supervised Hyperspectral Image Classification
We used the Ladder Network [Rasmus et al. (2015)] to perform Hyperspectral Image Classification in a semi-supervised setting. The Ladder Network distinguishes itself from other semi-supervised methods by jointly optimizing a supervised and unsupervised cost. In many settings this has proven to be more successful than other semi-supervised techniques, such as pretraining using unlabeled data. We furthermore show that the convolutional Ladder Network outperforms most of the current techniques used in hyperspectral image classification and achieves new state-of-the-art performance on the Pavia University dataset given only 5 labeled data points per class.
- North America > United States > Indiana > Tippecanoe County > West Lafayette (0.04)
- North America > United States > Indiana > Tippecanoe County > Lafayette (0.04)
- Europe > Switzerland > Zürich > Zürich (0.04)
- Europe > Italy (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.89)
- Information Technology > Artificial Intelligence > Vision > Image Understanding (0.83)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
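The Ladder Network's defining feature, per the abstract, is the joint optimization of a supervised cost and per-layer unsupervised denoising costs. A minimal sketch of that combined objective, assuming squared-error denoising terms with layer weights `lams` (the signature and weighting scheme are illustrative):

```python
import numpy as np

def ladder_cost(y_true, y_pred, clean_acts, denoised_acts, lams):
    """Joint Ladder Network cost (sketch).

    y_true, y_pred : one-hot target and predicted class probabilities.
    clean_acts     : per-layer activations of the clean encoder path.
    denoised_acts  : per-layer reconstructions from the decoder path.
    lams           : per-layer weights on the denoising costs.
    """
    ce = -np.sum(y_true * np.log(y_pred + 1e-9))  # supervised term
    denoise = sum(l * np.mean((z - zh) ** 2)       # unsupervised terms
                  for l, z, zh in zip(lams, clean_acts, denoised_acts))
    return ce + denoise
```

Unlabeled examples simply contribute only the denoising terms, which is how the same objective covers the semi-supervised setting with as few as 5 labels per class.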
Implicit Autoencoders
In this paper, we describe the "implicit autoencoder" (IAE), a generative autoencoder in which both the generative path and the recognition path are parametrized by implicit distributions. We use two generative adversarial networks to define the reconstruction and the regularization cost functions of the implicit autoencoder, and derive the learning rules based on maximum-likelihood learning. Using implicit distributions allows us to learn more expressive posterior and conditional likelihood distributions for the autoencoder. Learning an expressive conditional likelihood distribution enables the latent code to only capture the abstract and high-level information of the data, while the remaining information is captured by the implicit conditional likelihood distribution. For example, we show that implicit autoencoders can disentangle the global and local information, and perform deterministic or stochastic reconstructions of the images. We further show that implicit autoencoders can disentangle discrete underlying factors of variation from the continuous factors in an unsupervised fashion, and perform clustering and semi-supervised learning.
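The IAE's objective, as the abstract describes it, is the sum of two adversarially defined costs: one discriminator scores reconstructions (the conditional likelihood term) and another scores the latent code (the regularization term). A deliberately thin sketch of that structure, with both discriminators abstracted to the probabilities they output (everything here is illustrative):

```python
import numpy as np

def iae_generator_loss(d_recon_score, d_latent_score):
    """Implicit-autoencoder generator objective (structural sketch).

    d_recon_score  : reconstruction discriminator's probability that the
                     reconstructed sample is "real" (conditional likelihood).
    d_latent_score : latent discriminator's probability that the code is
                     drawn from the prior (regularization).
    The encoder/decoder are trained to drive both scores high.
    """
    return -np.log(d_recon_score + 1e-9) - np.log(d_latent_score + 1e-9)
```

Because both cost functions are adversarial rather than pixel-wise, the conditional likelihood can stay expressive, which is what lets the latent code shed local detail and keep only global structure.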
Active Lifelong Learning With "Watchdog"
Sun, Gan (Shenyang Institute of Automation, Chinese Academy of Sciences) | Cong, Yang (University of Chinese Academy of Sciences) | Xu, Xiaowei (Shenyang Institute of Automation, Chinese Academy of Sciences)
Lifelong learning aims to learn new consecutive tasks based on previously accumulated experience, i.e., a knowledge library. However, the knowledge carried by different incoming tasks is imbalanced. In this paper, we therefore mimic an effective "human cognition" strategy by actively sorting new tasks by importance in an unknown-to-known process and preferentially selecting the most informative tasks to learn. To achieve this, we cast assessing the importance of an incoming task, i.e., whether it is unknown, as an outlier detection problem, and design a hierarchical dictionary learning model consisting of two-level task descriptors that sparsely reconstructs each task under an l0-norm constraint. Incoming tasks are sorted by their sparse reconstruction scores in descending order, and tasks with high reconstruction scores are permitted to pass; we call this mechanism the "watchdog." Next, the knowledge library of the lifelong learning framework encodes the selected task by transferring previous knowledge, and then automatically updates itself with knowledge from both the previously learned tasks and the current task. For model optimization, the alternating direction method is employed to solve our model, and it converges to a fixed point. Extensive experiments on both benchmark datasets and our own dataset demonstrate the effectiveness of the proposed model, especially for task selection and dictionary learning.
- Asia > Middle East > Jordan (0.05)
- North America > United States > California (0.04)
- North America > United States > Arkansas (0.04)
- Asia > China > Liaoning Province > Shenyang (0.04)
- Instructional Material (0.57)
- Research Report (0.46)
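The "watchdog" gating the abstract describes rests on a sparse reconstruction score: how well the existing knowledge library, used as a dictionary with an l0 budget, can reconstruct a new task's descriptor. A sketch using a greedy OMP-style pursuit in place of the paper's hierarchical two-level model (the scoring formula and `k` budget are illustrative):

```python
import numpy as np

def sparse_recon_score(D, t, k=2):
    """Score a task descriptor t by l0-constrained reconstruction over
    the knowledge-library dictionary D (one atom per column)."""
    residual, support = t.copy(), []
    for _ in range(k):                               # greedy atom selection
        j = int(np.argmax(np.abs(D.T @ residual)))
        support.append(j)
        coef, *_ = np.linalg.lstsq(D[:, support], t, rcond=None)
        residual = t - D[:, support] @ coef
    # 1 = perfectly explained by prior knowledge, 0 = entirely novel.
    return 1.0 - np.linalg.norm(residual) / (np.linalg.norm(t) + 1e-12)

def watchdog_sort(D, tasks, k=2):
    # Sort incoming tasks by reconstruction score in descending order;
    # high-scoring tasks are permitted to pass first.
    return sorted(tasks, key=lambda t: sparse_recon_score(D, t, k),
                  reverse=True)
```

A task that lies outside the span of the library (an "outlier" in the abstract's terms) gets a low score, which is exactly the unknown-task signal the watchdog uses.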
Canonical Divergence Analysis
Nguyen, Hoang-Vu, Vreeken, Jilles
We aim to analyze the relation between two random vectors that may have different numbers of attributes as well as of realizations, and that may not even have a joint distribution. This problem arises in many practical domains, including biology and architecture. Existing techniques assume the vectors to have the same domain or to be jointly distributed, and hence are not applicable. To address this, we propose Canonical Divergence Analysis (CDA). We introduce three instantiations, each of which permits practical implementation. Extensive empirical evaluation shows the potential of our method.